AITopics

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.60)

Seunghoon Hong, Hyeonwoo Noh, Bohyung Han

Decoupled Deep Neural Network for Semi-supervised Semantic Segmentation

Neural Information Processing Systems, Oct-2-2025, 16:27:24 GMT

Neural Information Processing Systems http://nips.cc/

annotation, segmentation, segmentation network, (16 more...)

Country: Asia > South Korea > Gyeongsangbuk-do > Pohang (0.04)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.64)

Zhang, Weifan, Li, Tingguang, Liu, Yuzhen

MAG-Nav: Language-Driven Object Navigation Leveraging Memory-Reserved Active Grounding

arXiv.org Artificial Intelligence, Aug-8-2025

Visual navigation in unknown environments based solely on natural language descriptions is a key capability for intelligent robots. In this work, we propose a navigation framework built upon off-the-shelf Visual Language Models (VLMs), enhanced with two human-inspired mechanisms: perspective-based active grounding, which dynamically adjusts the robot's viewpoint for improved visual inspection, and historical memory backtracking, which enables the system to retain and re-evaluate uncertain observations over time. Unlike existing approaches that passively rely on incidental visual inputs, our method actively optimizes perception and leverages memory to resolve ambiguity, significantly improving vision-language grounding in complex, unseen environments. Our framework operates in a zero-shot manner, achieving strong generalization to diverse and open-ended language descriptions without requiring labeled data or model fine-tuning. Experimental results on Habitat-Matterport 3D (HM3D) show that our method outperforms state-of-the-art approaches in language-driven object navigation. We further demonstrate its practicality through real-world deployment on a quadruped robot, achieving robust and effective navigation performance.

large language model, natural language, navigation, (16 more...)

2508.05021

Country: Asia > China (0.28)

Genre: Research Report (0.84)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.70)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.68)
Information Technology > Artificial Intelligence > Representation & Reasoning > Object-Oriented Architecture (0.47)

Neural Information Processing Systems, Jan-20-2025, 19:05:22 GMT

Reviews: Training and Evaluating Multimodal Word Embeddings with Large-scale Web Annotated Images

The paper was clear and well written. The data set and the evaluation that was conducted could be useful to the community. However, the paper unfairly characterizes or omits some previous work, and was not clear enough about the limitations and biases of its evaluation strategy. These points detract from a paper that otherwise makes an interesting contribution. First, there is an implied criticism of WordSim-353 and MEN at the bottom of page 2 that they only contain similarity judgments at the word level. However, there is a large amount of work on learning phrase and sentence-level embeddings in the recent literature that overcomes these issues (see representative work by Mirella Lapata, Marco Baroni, Stephen Clarke, Richard Socher, among many others), which the paper does not mention.

annotated image, limitation and bias, multimodal word embedding, (5 more...)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (0.77)
Information Technology > Artificial Intelligence > Natural Language (0.55)

Vödisch, Niclas, Petek, Kürsat, Käppeler, Markus, Valada, Abhinav, Burgard, Wolfram

A Good Foundation is Worth Many Labels: Label-Efficient Panoptic Segmentation

arXiv.org Artificial Intelligence, May-29-2024

A key challenge for the widespread application of learning-based models for robotic perception is to significantly reduce the required amount of annotated training data while achieving accurate predictions. This is essential not only to decrease operating costs but also to speed up deployment time. In this work, we address this challenge for PAnoptic SegmenTation with fEw Labels (PASTEL) by exploiting the groundwork paved by visual foundation models. We leverage descriptive image features from such a model to train two lightweight network heads for semantic segmentation and object boundary detection, using very few annotated training samples. We then merge their predictions via a novel fusion module that yields panoptic maps based on normalized cut. To further enhance the performance, we utilize self-training on unlabeled images selected by a feature-driven similarity scheme. We underline the relevance of our approach by applying PASTEL to important robot perception use cases from autonomous driving and agricultural robotics. In extensive experiments, we demonstrate that PASTEL significantly outperforms previous methods for label-efficient segmentation even when using fewer annotations. The code of our work is publicly available at http://pastel.cs.uni-freiburg.de.

panoptic segmentation, pastel, segmentation, (16 more...)

2405.19035

Country:

Europe > Germany > Baden-Württemberg > Freiburg (0.24)
Europe > Germany > Bavaria > Middle Franconia > Nuremberg (0.04)

Genre: Research Report (1.00)

Industry: Transportation > Ground > Road (0.34)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Vision > Image Understanding (0.46)

Ghanbari, Alireza, Shirdel, Gholamhassan, Maleki, Farhad

Semi-Self-Supervised Domain Adaptation: Developing Deep Learning Models with Limited Annotated Data for Wheat Head Segmentation

arXiv.org Artificial Intelligence, May-12-2024

Precision agriculture involves the application of advanced technologies to improve agricultural productivity, efficiency, and profitability while minimizing waste and environmental impact. Deep learning approaches enable automated decision-making for many visual tasks. However, in the agricultural domain, variability in growth stages and environmental conditions, such as weather and lighting, presents significant challenges to developing deep learning-based techniques that generalize across different conditions. The resource-intensive nature of creating extensive annotated datasets that capture these variabilities further hinders the widespread adoption of these approaches. To tackle these issues, we introduce a semi-self-supervised domain adaptation technique based on deep convolutional neural networks with a probabilistic diffusion process, requiring minimal manual data annotation. Using only three manually annotated images and a selection of video clips from wheat fields, we generated a large-scale computationally annotated dataset of image-mask pairs and a large dataset of unannotated images extracted from video frames. We developed a two-branch convolutional encoder-decoder model architecture that uses both synthesized image-mask pairs and unannotated images, enabling effective adaptation to real images. The proposed model achieved a Dice score of 80.7% on an internal test dataset and a Dice score of 64.8% on an external test set, composed of images from five countries and spanning 18 domains, indicating its potential to develop generalizable solutions that could encourage the wider adoption of advanced technologies in agriculture.
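For readers unfamiliar with the metric quoted in this abstract, the Dice score between a predicted mask and a reference mask is twice the overlap divided by the combined number of foreground pixels. The snippet below is a minimal sketch of that standard definition, not the authors' evaluation code; the function name `dice_score`, the flat 0/1 list representation, and the convention for two empty masks are assumptions made for illustration.

```python
def dice_score(pred, target):
    """Dice coefficient between two binary masks given as flat sequences of 0/1."""
    if len(pred) != len(target):
        raise ValueError("masks must have the same number of pixels")
    # Pixels marked as foreground in both masks.
    intersection = sum(1 for p, t in zip(pred, target) if p and t)
    # Foreground pixels in the prediction plus foreground pixels in the reference.
    total = sum(1 for p in pred if p) + sum(1 for t in target if t)
    # Assumed convention: two empty masks count as a perfect match.
    return 1.0 if total == 0 else 2.0 * intersection / total


# Toy example: 2 overlapping foreground pixels, 3 foreground pixels in each mask.
pred = [1, 1, 0, 0, 1, 0]
target = [1, 0, 0, 0, 1, 1]
score = dice_score(pred, target)
print(f"Dice = {score:.3f}")
```

A reported figure such as 80.7% would typically be this quantity averaged over a test set and expressed as a percentage, although the abstract does not state the averaging scheme.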

dataset, international conference, semi-self-supervised domain adaptation, (14 more...)

2405.07157

Country:

North America > Canada > Alberta > Census Division No. 6 > Calgary Metropolitan Region > Calgary (0.14)
Asia > Middle East > Iran > Qom Province > Qom (0.04)
Europe > Netherlands > North Holland > Amsterdam (0.04)

Genre: Research Report (0.82)

Industry: Food & Agriculture > Agriculture (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Neural Information Processing Systems, Apr-10-2023, 14:41:24 GMT

Decoupled Deep Neural Network for Semi-supervised Semantic Segmentation

Hyeonwoo Noh

We propose a novel deep neural network architecture for semi-supervised semantic segmentation using heterogeneous annotations. Contrary to existing approaches posing semantic segmentation as a single task of region-based classification, our algorithm decouples classification and segmentation, and learns a separate network for each task. In this architecture, labels associated with an image are identified by the classification network, and binary segmentation is subsequently performed for each identified label in the segmentation network. The decoupled architecture enables us to learn classification and segmentation networks separately based on the training data with image-level and pixel-wise class labels, respectively. It effectively reduces the search space for segmentation by exploiting class-specific activation maps obtained from bridging layers. Our algorithm shows outstanding performance compared to other semi-supervised approaches, with far fewer training images with strong annotations, on the PASCAL VOC dataset.

annotation, segmentation, segmentation network, (16 more...)

Country: Asia > South Korea > Gyeongsangbuk-do > Pohang (0.04)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.84)

arXiv.org Artificial Intelligence, Mar-13-2023

One-Shot Segmentation of Novel White Matter Tracts via Extensive Data Augmentation

Liu, Wan, Lu, Qi, Zhuo, ZhiZheng, Liu, Yaou, Ye, Chuyang

Deep learning based methods have achieved state-of-the-art performance for automated white matter (WM) tract segmentation. In these methods, the segmentation model needs to be trained with a large number of manually annotated scans, which can be accumulated over time. When novel WM tracts, i.e., tracts not included in the existing annotated WM tracts, are to be segmented, additional annotations of these novel WM tracts need to be collected. Since tract annotation is time-consuming and costly, it is desirable to make only a few annotations of novel WM tracts for training the segmentation model, and previous work has addressed this problem by transferring the knowledge learned for segmenting existing WM tracts to the segmentation of novel WM tracts. However, accurate segmentation of novel WM tracts can still be challenging in the one-shot setting, where only one scan is annotated for the novel WM tracts. In this work, we explore the problem of one-shot segmentation of novel WM tracts. Since in the one-shot setting the annotated training data is extremely scarce, based on the existing knowledge transfer framework, we propose to further perform extensive data augmentation for the single annotated scan, where synthetic annotated training data is produced. We have designed several different strategies that mask out regions in the single annotated scan for data augmentation. Our method was evaluated on public and in-house datasets. The experimental results show that our method improves the accuracy of one-shot segmentation of novel WM tracts.
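The abstract does not say how regions are masked out, so the following is only a generic illustration of the idea: producing several training samples from one annotated image by zeroing a random rectangular region in each copy. Everything here is an assumption made for the sketch, including the names `mask_out_region` and `augment`, the 2D list-of-rows image, the rectangular box, and the choice to leave the annotation unchanged. It should not be read as the authors' strategies.

```python
import random


def mask_out_region(image, box_h, box_w, rng):
    """Return a copy of a 2D image (list of rows) with one random box set to 0."""
    rows, cols = len(image), len(image[0])
    if not (0 < box_h <= rows and 0 < box_w <= cols):
        raise ValueError("box must fit inside the image")
    # Pick the top-left corner so the box stays fully inside the image.
    top = rng.randrange(rows - box_h + 1)
    left = rng.randrange(cols - box_w + 1)
    masked = [row[:] for row in image]  # copy so the original is untouched
    for r in range(top, top + box_h):
        for c in range(left, left + box_w):
            masked[r][c] = 0
    return masked


def augment(image, annotation, n_samples, box_h, box_w, seed=0):
    """Build n_samples (masked image, annotation) pairs from one annotated image.

    Assumption: only the image is masked; the annotation is reused as-is.
    """
    rng = random.Random(seed)
    return [(mask_out_region(image, box_h, box_w, rng), annotation)
            for _ in range(n_samples)]


# Toy 4x4 "scan" with a hypothetical annotation of the same shape.
scan = [[1] * 4 for _ in range(4)]
labels = [[0, 1, 1, 0] for _ in range(4)]
samples = augment(scan, labels, n_samples=3, box_h=2, box_w=2)
print(len(samples), "augmented samples")
```

Seeding the generator keeps the synthetic samples reproducible, which helps when comparing augmentation settings.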

artificial intelligence, deep learning, machine learning, (14 more...)

2303.06852

Country: Asia > China > Beijing > Beijing (0.05)

Genre: Research Report > New Finding (0.34)

Industry:

Health & Medicine > Diagnostic Medicine > Imaging (0.94)
Health & Medicine > Therapeutic Area > Neurology (0.93)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.69)

Deleruyelle, Arnaud, Versari, Cristian, Klein, John

Self-mentoring: a new deep learning pipeline to train a self-supervised U-net for few-shot learning of bio-artificial capsule segmentation

arXiv.org Artificial Intelligence, Jan-7-2023

Background: Accurate segmentation of microscopic structures such as bio-artificial capsules in microscopy imaging is a prerequisite to the computer-aided understanding of important biomechanical phenomena. State-of-the-art segmentation performance is achieved by deep neural networks and related data-driven approaches. Training these networks from only a few annotated examples is challenging, while producing manually annotated images that provide supervision is tedious. Method: Recently, self-supervision, i.e. designing a neural pipeline providing synthetic or indirect supervision, has proved to significantly increase the generalization performance of models trained on few shots. The objective of this paper is to introduce one such neural pipeline in the context of micro-capsule image segmentation. Our method leverages the rather simple content of these images so that a trainee network can be mentored by a referee network which has been previously trained on synthetically generated pairs of corrupted/correct region masks. Results: Challenging experimental setups are investigated. They involve only 3 to 10 annotated images along with moderately large amounts of unannotated images. In a bio-artificial capsule dataset, our approach consistently and drastically improves accuracy. We also show that the learnt referee network is transferable to another Glioblastoma cell dataset and that it can be efficiently coupled with data augmentation strategies. Conclusions: Experimental results show that very significant accuracy increments are obtained by the proposed pipeline, leading to the conclusion that the self-supervision mechanism introduced in this paper has the potential to replace human annotations.

artificial intelligence, machine learning, segmentation, (19 more...)

doi: 10.1016/j.compbiomed.2022.106454

2205.1084

Country: North America > United States > California > Alameda County > Berkeley (0.04)

Genre: Research Report > New Finding (0.66)

Industry: Health & Medicine > Therapeutic Area > Oncology > Brain Cancer (0.34)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.85)